Better SQL for to-one joins #37819

Open
roji wants to merge 1 commit into dotnet:main from roji:SplitQueryReferenceJoin

roji commented Mar 1, 2026

  • Stop adding to-one-joined entity keys to query identifiers
  • Prune unneeded to-one JOINs

Closes #29182

roji force-pushed the SplitQueryReferenceJoin branch from 184531e to eeaf668 (March 2, 2026 08:19)
roji force-pushed the SplitQueryReferenceJoin branch from eeaf668 to 4a3ef03 (March 2, 2026 09:03)
roji marked this pull request as ready for review March 2, 2026 16:14
roji requested a review from a team as a code owner March 2, 2026 16:14
Copilot AI review requested due to automatic review settings March 2, 2026 16:14

Copilot AI left a comment

Pull request overview

This PR improves query SQL generation by avoiding redundant identifier expansion and enabling pruning of unnecessary to-one reference joins, especially benefiting split queries (issue #29182).

Changes:

  • Treat single-result/to-one joins as not increasing cardinality, so inner identifiers aren’t added to the outer query identifier.
  • Detect to-one joins via join predicates and mark corresponding LEFT JOINs as prunable.
  • Update many SQL assertion baselines across SqlServer/Sqlite tests to reflect fewer JOINs, projections, and ORDER BY columns.
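The predicate-based detection described in the second bullet can be pictured with a deliberately simplified model. This is a sketch, not the code in SelectExpression.cs: columns are plain strings, the join predicate is reduced to a list of outer-column/inner-column equalities, and the names `ToOneJoinSketch` and `IsToOneJoin` are hypothetical. The assumption is that a join counts as to-one when every inner identifier column is equated to an outer column, so each outer row can match at most one inner row.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified model; none of these types exist in EF Core.
public static class ToOneJoinSketch
{
    // Returns true when every inner identifier column is pinned by an
    // equality in the join predicate.
    public static bool IsToOneJoin(
        IReadOnlyList<string> innerIdentifiers,
        IReadOnlyList<(string OuterColumn, string InnerColumn)> predicateEqualities)
    {
        if (innerIdentifiers.Count == 0)
        {
            // Nothing identifies an inner row, so make no to-one claim.
            return false;
        }

        var pinned = new HashSet<string>(predicateEqualities.Select(eq => eq.InnerColumn));
        return innerIdentifiers.All(id => pinned.Contains(id));
    }
}
```

In this model, joining orders to customers on `o.CustomerId = c.Id` is to-one because the customer key `c.Id` is fully covered, while joining orders to order details on `o.Id = d.OrderId` is not, because the composite key (`d.OrderId`, `d.ProductId`) is only partially covered.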

Reviewed changes

Copilot reviewed 60 out of 60 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/EFCore.Relational/Query/SqlExpressions/SelectExpression.cs | Adds to-one join awareness to identifier propagation and marks certain LEFT JOINs as prunable; introduces predicate-based to-one detection. |
| test/EFCore.Sqlite.FunctionalTests/BulkUpdates/NorthwindBulkUpdatesSqliteTest.cs | Updates SQL baselines to remove now-pruned LEFT JOIN subqueries/parameters. |
| test/EFCore.Sqlite.FunctionalTests/BulkUpdates/NonSharedModelBulkUpdatesSqliteTest.cs | Updates SQL baseline to remove redundant LEFT JOIN. |
| test/EFCore.SqlServer.FunctionalTests/BulkUpdates/NorthwindBulkUpdatesSqlServerTest.cs | Updates SQL baselines to remove now-pruned LEFT JOIN subqueries/parameters. |
| test/EFCore.SqlServer.FunctionalTests/BulkUpdates/NonSharedModelBulkUpdatesSqlServerTest.cs | Updates SQL baseline to remove redundant LEFT JOIN. |
| test/EFCore.SqlServer.FunctionalTests/Query/*.cs (multiple) | Updates many SQL baselines (fewer projected key columns, fewer JOINs, simplified ORDER BY). |

Span<bool> matched = innerIdentifiers.Count <= 8
    ? stackalloc bool[innerIdentifiers.Count]
    : new bool[innerIdentifiers.Count];

Copilot AI Mar 2, 2026

AllInnerIdentifiersInPredicate uses stackalloc bool[...] for the matched buffer, but stackalloc memory isn't guaranteed to be zero-initialized. That can cause random true values, leading to false positives (incorrectly treating a join as to-one and skipping identifiers / pruning joins). Initialize the span (e.g., matched.Clear() right after allocation) or avoid stackalloc here.

Suggested change
matched.Clear();
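
For context, the suggested pattern can be shown in isolation. The following is a minimal standalone sketch, not the PR's AllInnerIdentifiersInPredicate: the class `MatchedBufferSketch`, the method `AllMatched`, and the use of integers as stand-ins for identifier columns are all hypothetical. It clears the buffer explicitly, so the result does not depend on whether the stack memory happened to be zeroed.

```csharp
using System;

// Hypothetical illustration of a small "matched" buffer that is cleared
// before use; integers stand in for identifier columns.
public static class MatchedBufferSketch
{
    // Returns true when every value in candidates also appears in seen.
    // Candidates are assumed to be distinct; an empty candidate list
    // returns true vacuously.
    public static bool AllMatched(ReadOnlySpan<int> candidates, ReadOnlySpan<int> seen)
    {
        Span<bool> matched = candidates.Length <= 8
            ? stackalloc bool[candidates.Length]
            : new bool[candidates.Length];

        // Explicitly zero the buffer instead of relying on its initial contents.
        matched.Clear();

        foreach (var value in seen)
        {
            var index = candidates.IndexOf(value);
            if (index >= 0)
            {
                matched[index] = true;
            }
        }

        foreach (var flag in matched)
        {
            if (!flag)
            {
                return false;
            }
        }

        return true;
    }
}
```

Clearing a buffer of at most eight booleans is cheap, so the explicit `Clear()` costs little compared with the risk of a false positive.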

Comment on lines 3172 to +3191
 if (_identifier.Count > 0 && innerSelect._identifier.Count > 0)
 {
-    switch (joinType)
+    if (!isToOneJoin)
     {
-        case JoinType.LeftJoin or JoinType.OuterApply:
-            _identifier.AddRange(innerSelect._identifier.Select(e => (e.Column.MakeNullable(), e.Comparer)));
-            break;
+        switch (joinType)
+        {
+            case JoinType.LeftJoin or JoinType.OuterApply:
+                _identifier.AddRange(innerSelect._identifier.Select(e => (e.Column.MakeNullable(), e.Comparer)));
+                break;
 
-        case JoinType.RightJoin:
-            var nullableOuterIdentifier = _identifier.Select(e => (e.Column.MakeNullable(), e.Comparer)).ToList();
-            _identifier.Clear();
-            _identifier.AddRange(nullableOuterIdentifier);
-            _identifier.AddRange(innerSelect._identifier);
-            break;
+            case JoinType.RightJoin:
+                var nullableOuterIdentifier = _identifier.Select(e => (e.Column.MakeNullable(), e.Comparer)).ToList();
+                _identifier.Clear();
+                _identifier.AddRange(nullableOuterIdentifier);
+                _identifier.AddRange(innerSelect._identifier);
+                break;
 
-        default:
-            _identifier.AddRange(innerSelect._identifier);
-            break;
+            default:
+                _identifier.AddRange(innerSelect._identifier);
+                break;

Copilot AI Mar 2, 2026

The isToOneJoin optimization currently skips all identifier merging, including the special JoinType.RightJoin handling. For a right join, leaving _identifier as the outer identifiers can be incorrect since the preserved side is the inner; this may break row-identity/order-by requirements. Consider excluding RightJoin from the isToOneJoin shortcut, or explicitly setting identifiers for RightJoin (e.g., ensure inner identifiers remain the identifier set when the join is to-one).
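
The first option mentioned above, excluding RightJoin from the shortcut, could look roughly like the following. This is a simplified model rather than a patch for SelectExpression.cs: `JoinKind`, `IdentifierMergeSketch`, and `Merge` are hypothetical names, identifier columns are plain strings, and a trailing `?` stands in for `Column.MakeNullable()`.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for EF Core's JoinType, reduced to three cases.
public enum JoinKind { Inner, Left, Right }

public static class IdentifierMergeSketch
{
    // Returns the identifier list of the joined query. A trailing "?" marks
    // a column made nullable by the join.
    public static List<string> Merge(
        List<string> outer, List<string> inner, JoinKind kind, bool isToOneJoin)
    {
        if (outer.Count == 0 || inner.Count == 0)
        {
            return new List<string>(outer);
        }

        // Take the to-one shortcut only when the outer side is preserved;
        // a right join preserves the inner side, so it is merged as before.
        if (isToOneJoin && kind != JoinKind.Right)
        {
            return new List<string>(outer);
        }

        return kind switch
        {
            JoinKind.Left => outer.Concat(inner.Select(c => c + "?")).ToList(),
            JoinKind.Right => outer.Select(c => c + "?").Concat(inner).ToList(),
            _ => outer.Concat(inner).ToList(),
        };
    }
}
```

With this variant, a to-one left join keeps only the outer identifiers, while a to-one right join still makes the outer identifiers nullable and appends the inner ones.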

Development

Successfully merging this pull request may close these issues.

Improve generated SQL for reference joins

2 participants